64 research outputs found

    Designing optimal- and fast-on-average pattern matching algorithms

    Given a pattern w and a text t, the speed of a pattern matching algorithm over t with regard to w is the ratio of the length of t to the number of text accesses performed when searching for w in t. We first propose a general method for computing the limit of the expected speed of pattern matching algorithms, with regard to w, over iid texts. Next, we show how to determine the greatest speed that can be achieved within a large class of algorithms, together with an algorithm achieving this speed. Since the complexity of this determination makes it impossible to handle patterns of length greater than 4, we propose a polynomial heuristic. Finally, our approaches are compared with 9 pre-existing pattern matching algorithms from both a theoretical and a practical point of view, i.e. both in terms of limit expected speed on iid texts and in terms of observed average speed on real data. In all cases, the pre-existing algorithms are outperformed.
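
The speed defined in this abstract (text length divided by the number of text accesses) can be illustrated with a minimal sketch. It uses the classical Horspool algorithm, which is only a stand-in and not the optimal algorithm or the heuristic the paper proposes. The function name `horspool_speed` and the counting convention (one access per character comparison, with the shift lookup reusing a character already read) are assumptions made for this illustration.

```python
def horspool_speed(pattern: str, text: str) -> tuple[list[int], float]:
    """Search `pattern` in `text` with Horspool's algorithm.

    Returns (match positions, speed), where speed = len(text) / text accesses.
    Assumption: every character comparison costs one text access.
    """
    m, n = len(pattern), len(text)
    if m == 0 or n < m:
        raise ValueError("pattern must be non-empty and no longer than the text")

    # Bad-character table: distance from the last occurrence of each character
    # (final pattern position excluded) to the end of the pattern.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}

    accesses = 0
    matches = []
    pos = 0
    while pos <= n - m:
        # Compare the window right to left, counting each text character read.
        j = m - 1
        while j >= 0:
            accesses += 1
            if text[pos + j] != pattern[j]:
                break
            j -= 1
        if j < 0:
            matches.append(pos)
        # The last character of the window was read first, so it is reused
        # here without being counted as a new access.
        pos += shift.get(text[pos + m - 1], m)

    return matches, n / accesses


if __name__ == "__main__":
    print(horspool_speed("abc", "xxxxxxxxx"))  # three windows, one access each
    print(horspool_speed("ab", "abab"))        # every character is read once
```

Under this convention a speed above 1 means the algorithm skipped some text characters entirely, which is the behaviour the abstract's notion of speed rewards.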

    ModClust: a Cytoscape plugin for modularity-based clustering of networks

    Large networks such as protein-protein interaction networks are usually extremely difficult to understand as a whole. We developed ModClust, a Cytoscape plugin for modularity-based clustering of large networks. The aim of this plugin is first to establish classes with a high density of edges. It also helps in understanding the relations between these classes and how they are assembled within the whole graph. It can be used to predict new protein functions. It implements two novel algorithms: FT and TFit. Their results are compared both on random graphs and on benchmarks where the optimal partition is known.
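
As an illustration of the quantity that modularity-based clustering optimizes, here is a minimal sketch that scores a given partition. It assumes the standard Newman-Girvan modularity on an undirected, unweighted graph without self-loops; the abstract does not state which modularity variant ModClust uses, and the FT and TFit algorithms themselves are not reproduced here. The function name and the input representation (edge list plus a node-to-class mapping) are hypothetical.

```python
from collections import defaultdict


def modularity(edges, partition):
    """Newman-Girvan modularity of `partition` on an undirected graph.

    edges:     list of (u, v) pairs, each undirected edge listed once
    partition: dict mapping every node to a class label
    Q = sum over classes c of  e_c / m - (d_c / (2 m))^2,
    where e_c is the number of edges inside c, d_c the sum of degrees in c,
    and m the total number of edges.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph without edges")

    internal = defaultdict(int)    # e_c: edges with both ends in class c
    degree_sum = defaultdict(int)  # d_c: total degree of the nodes in class c
    for u, v in edges:
        cu, cv = partition[u], partition[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1

    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


if __name__ == "__main__":
    # Two triangles joined by a single edge.
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    two_classes = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
    print(modularity(edges, two_classes))
```

A partition that puts every node in one class scores 0 under this definition, so a positive score indicates classes denser than expected from the node degrees alone.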

    Local Similarity Between Quotiented Ordered Trees

    In this paper we propose a dynamic programming algorithm to evaluate local similarity between ordered quotiented trees using a constrained edit scoring scheme. A quotiented tree is a tree equipped with an additional equivalence relation on its vertices such that the quotient graph is also a tree. The core of the method relies on two adaptations of an algorithm proposed by Zhang et al. [K. Zhang, D. Shasha, Simple fast algorithms for the editing distance between trees and related problems (1989) 1245-1262] for comparing ordered rooted trees. After some preliminary definitions and a description of this tree edit algorithm, we propose extensions to compare two quotiented trees globally and locally. The local method finds the region of highest similarity in each tree. The algorithms are currently being used in genomic analysis to evaluate variability between RNA secondary structures.

    Effects of chilling on the expression of ethylene biosynthetic genes in Passe-Crassane pear (Pyrus communis L.) fruits

    Passe-Crassane pears require a 3-month chilling treatment at 0 °C to be able to produce ethylene and ripen autonomously after subsequent rewarming. The chilling treatment strongly stimulated ACC oxidase activity and, to a lesser extent, ACC synthase activity. At the same time, the levels of mRNAs hybridizing to ACC synthase and ACC oxidase probes increased dramatically. Fruit stored at 18 °C immediately after harvest did not exhibit any of these changes, while fruit that had been previously chilled exhibited a burst of ethylene production associated with high activity of ACC oxidase and ACC synthase upon rewarming. ACC oxidase mRNA strongly accumulated in rewarmed fruits, while the ACC synthase mRNA level decreased. The chilling-induced accumulation of ACC synthase and ACC oxidase transcripts was strongly reduced when ethylene action was blocked during chilling with 1-methylcyclopropene (1-MCP). Upon rewarming, ACC synthase and ACC oxidase transcripts rapidly disappeared in 1-MCP-treated fruits. A five-week treatment of non-chilled fruits with the ethylene analog propylene led to increased expression of ACC oxidase and to ripening. However, ethylene synthesis, ACC synthase activity and ACC synthase mRNAs remained at very low levels. Our data indicate that ACC synthase gene expression is regulated by ethylene only during or after chilling treatment, while ACC oxidase gene expression can be induced separately by either chilling or ethylene.

    About the largest subtree common to several X-trees

    Given several X-trees, or unrooted phylogenetic trees, on the same set of taxa X, we look for a largest subset Y ⊂ X such that all the partial trees induced on Y are topologically identical, that is, identical regardless of edge lengths. Such a common subtree is called a MAST, for Maximum Agreement SubTree. The problem has polynomial complexity when there are only two trees, but it is NP-hard in general for more than two. We introduce a polynomial approximation algorithm for the multiple case, which is easy to implement, very efficient, and produces a maximal common subtree. It begins by computing an upper bound on the size of the subtree and identifies elements of X that cannot belong to a common subtree of a given size. It is efficient enough on about a hundred X-trees connecting about a hundred elements to evaluate the average size of a subtree common to independent X-trees. Simulations on random and real data have shown that this heuristic often provides an optimal solution as soon as the number of trees is larger than 5. We then develop a statistical study to evaluate the average size of a MAST of independent trees. The computed distribution makes it possible to estimate the critical size of a MAST and to measure the congruence between several evolutionary trees.

    SimCT: a generic tool to visualize ontology-based relationships for biological objects

    Summary: We present a web-based service, SimCT, which graphically displays the relationships between biological objects (e.g. genes or proteins) based on their annotations to a biomedical ontology. The result is presented as a tree of these objects, which can be viewed and explored through a dedicated Java applet designed to highlight relevant features. Unlike the numerous tools that search for overrepresented terms, SimCT draws a simplified representation of the biological terms present in the set of objects, and can be applied to any ontology for which annotation data is available. Being web-based, it does not require prior installation, and provides an intuitive, easy-to-use service.

    Clust&See: A Cytoscape plugin for the identification, visualization and manipulation of network clusters

    Background and scope: Large networks, such as protein interaction networks, are extremely difficult to analyze as a whole. We developed Clust&See, a Cytoscape plugin dedicated to the identification, visualization and analysis of clusters extracted from such networks. Implementation and performance: Clust&See provides the ability to apply three different, recently developed graph clustering algorithms to networks and to visualize: (i) the obtained partition as a quotient graph in which nodes correspond to clusters and (ii) the obtained clusters as their corresponding subnetworks. Importantly, tools for investigating the relationships between clusters and vertices as well as their organization within the whole graph are supplied.
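
The quotient graph mentioned in this abstract, in which each node stands for a cluster, can be sketched as follows. This is only an illustration of the general construction, not Clust&See's code: the function name, the edge-list input, and the choice to weight each quotient edge by the number of crossing edges are all assumptions.

```python
from collections import Counter


def quotient_graph(edges, partition):
    """Collapse each cluster of `partition` into a single node.

    edges:     list of (u, v) pairs of an undirected graph
    partition: dict mapping every node to a cluster label
    Returns a dict {(cluster_a, cluster_b): count} with cluster_a < cluster_b,
    where count is the number of original edges running between the two
    clusters. Edges inside a cluster are dropped.
    """
    weights = Counter()
    for u, v in edges:
        cu, cv = partition[u], partition[v]
        if cu != cv:
            # Sort the pair so (A, B) and (B, A) share one undirected key.
            weights[tuple(sorted((cu, cv)))] += 1
    return dict(weights)


if __name__ == "__main__":
    # Two triangles joined by a single edge, clustered as A and B.
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    clusters = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
    print(quotient_graph(edges, clusters))
```

Keeping the crossing-edge count on each quotient edge is one simple way to show how strongly two clusters are connected when the partition is drawn as a single small graph.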

    A Comprehensive Analysis of Gene Expression Changes Provoked by Bacterial and Fungal Infection in C. elegans

    While Caenorhabditis elegans specifically responds to infection by the up-regulation of certain genes, distinct pathogens trigger the expression of a common set of genes. We applied new methods to conduct a comprehensive and comparative study of the transcriptional response of C. elegans to bacterial and fungal infection. Using tiling arrays and/or RNA-sequencing, we have characterized the genome-wide transcriptional changes that underlie the host's response to infection by three bacterial (Serratia marcescens, Enterococcus faecalis and Photorhabdus luminescens) and two fungal pathogens (Drechmeria coniospora and Harposporium sp.). We developed a flexible tool, the WormBase Converter (available at http://wormbasemanager.sourceforge.net/), to allow cross-study comparisons. The new data sets provided more extensive lists of differentially regulated genes than previous studies. Annotation analysis confirmed that genes commonly up-regulated by bacterial infections are related to stress responses. We found substantial overlaps between the genes regulated upon intestinal infection by the bacterial pathogens and Harposporium, and between those regulated by Harposporium and D. coniospora, which infects the epidermis. Among the fungus-regulated genes, there was a significant bias towards genes that are evolving rapidly and potentially encode small proteins. The results obtained using new methods reveal that the response to infection in C. elegans is determined by the nature of the pathogen, the site of infection and the physiological imbalance provoked by infection. They form the basis for future functional dissection of innate immune signaling. Finally, we also propose alternative methods to identify differentially regulated genes that take into account the greater variability in lowly expressed genes.